Figure 1 Overall process used in region extraction 

3 Input Image Data 

The digital input images are assumed to be in YUV format. If the inputs are in a 
chrominance sub-sampled format such as 420, 411 or 422, the chrominance data is 
upsampled to generate 444 material. 
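
The upsampling step can be sketched as follows. The text does not say which interpolation filter is used, so this minimal example assumes simple sample replication; the function name `upsample_chroma` and the factor arguments are hypothetical.

```python
def upsample_chroma(plane, fx, fy):
    """Replicate every chroma sample fx times horizontally and fy times vertically.

    Assumed factors: 4:2:0 -> (2, 2), 4:2:2 -> (2, 1), 4:1:1 -> (4, 1).
    Sample replication is an assumption; the source does not name a filter.
    """
    out = []
    for row in plane:
        wide = [v for v in row for _ in range(fx)]   # horizontal replication
        out.extend(list(wide) for _ in range(fy))    # vertical replication
    return out


# A 2x2 chroma plane from a 4:2:0 picture becomes a 4x4 plane.
u_420 = [[100, 110],
         [120, 130]]
u_444 = upsample_chroma(u_420, 2, 2)
```

A smoother interpolator (bilinear, for instance) could replace the replication without changing the rest of the pipeline.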
4 Feature Vector Generation 

We extract one feature vector for each PxQ block of the input picture. There are two 
stages in the feature vector generation process. In the first stage, we transform the data 
from the original YUV color co-ordinate system into another co-ordinate system known 
as CIE - L*a*b* [see Fundamentals of Digital Image Processing, by Anil K. Jain, 
Prentice-Hall, Section 3.9]. The latter is known to be a perceptually uniform color 
system, i.e. the Euclidean distance between two points (or colors) in the CIE - L*a*b* 
co-ordinate system corresponds to the perceptual difference between the colors. 
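
A minimal sketch of the color transform for a single pixel is given below. The text does not specify the YUV variant, the RGB primaries or the reference white, so the example assumes 8-bit full-range BT.601 YCbCr, sRGB primaries and transfer curve, and a D65 white point. The function name `yuv_to_lab` is hypothetical.

```python
def yuv_to_lab(y, u, v):
    """Convert one 8-bit YUV sample to CIE L*a*b* under the assumptions above."""
    # Assumed: full-range BT.601 YCbCr -> R'G'B'
    r = y + 1.402 * (v - 128)
    g = y - 0.344136 * (u - 128) - 0.714136 * (v - 128)
    b = y + 1.772 * (u - 128)
    rgb = [min(max(c, 0.0), 255.0) / 255.0 for c in (r, g, b)]

    # Assumed: sRGB transfer curve -> linear RGB
    rl, gl, bl = [c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4
                  for c in rgb]

    # Linear RGB (sRGB primaries) -> CIE XYZ
    x = 0.4124564 * rl + 0.3575761 * gl + 0.1804375 * bl
    yy = 0.2126729 * rl + 0.7151522 * gl + 0.0721750 * bl
    z = 0.0193339 * rl + 0.1191920 * gl + 0.9503041 * bl

    # XYZ -> L*a*b* relative to the assumed D65 white point
    xn, yn, zn = 0.95047, 1.0, 1.08883

    def f(t):
        d = 6.0 / 29.0
        return t ** (1.0 / 3.0) if t > d ** 3 else t / (3 * d * d) + 4.0 / 29.0

    fx, fy, fz = f(x / xn), f(yy / yn), f(z / zn)
    return 116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz)


white = yuv_to_lab(255, 128, 128)   # neutral white
black = yuv_to_lab(0, 128, 128)     # neutral black
```

Neutral inputs (U = V = 128) should map to a* and b* close to zero, which gives a quick sanity check on the chosen constants.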

The next stage in the feature vector generation process is the calculation of the first N 
moments of the CIE - L*a*b* data in each block. Thus, each feature vector has 3N 
components (N moments in L, N moments in a, and N moments in b). We can denote the 
(3Nx1) feature vector of the (i, j)-th block of the input picture as follows. 

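\[ f(i,j) = \left[\, m_{L,1}(i,j), \ldots, m_{L,N}(i,j),\; m_{a,1}(i,j), \ldots, m_{a,N}(i,j),\; m_{b,1}(i,j), \ldots, m_{b,N}(i,j) \,\right]^{T} \]

where the k-th moment in, say, the L component, is given by 

\[ m_{L,k}(i,j) = \frac{1}{PQ} \sum_{(x,y)} \left[ L(x,y) \right]^{k} \]

where (x, y) represents the index of a point in the (i, j)-th block. 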
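
The moment calculation can be sketched as below, assuming raw (non-central) moments averaged over the block, with P taken as the block height and Q as the block width. The names `block_moments` and `feature_vector` are hypothetical, and each channel is a plain list of rows.

```python
def block_moments(channel, i, j, P, Q, N):
    """First N raw moments of one channel over the (i, j)-th PxQ block."""
    values = [channel[x][y]
              for x in range(i * P, (i + 1) * P)
              for y in range(j * Q, (j + 1) * Q)]
    return [sum(v ** k for v in values) / len(values) for k in range(1, N + 1)]


def feature_vector(L, a, b, i, j, P, Q, N):
    """3N-component vector: N moments in L, then N in a, then N in b."""
    return (block_moments(L, i, j, P, Q, N)
            + block_moments(a, i, j, P, Q, N)
            + block_moments(b, i, j, P, Q, N))


# One 2x2 block, N = 2 moments per color component.
L_plane = [[1, 2], [3, 4]]
a_plane = [[0, 0], [0, 0]]
b_plane = [[2, 2], [2, 2]]
fv = feature_vector(L_plane, a_plane, b_plane, 0, 0, 2, 2, 2)
```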
5 Gradient Extraction 

The next stage in our region extraction process is that of gradient extraction. We will 
estimate a block-based gradient field for the input picture (i.e. we get one scalar gradient 
value for each PxQ block of the input picture). The gradient at the (i, j)-th block of the 
input picture is defined as the maximum of the distances between the block's feature 
vector and the feature vectors of its nearest neighbors. 

\[ \mathrm{grad}(i,j) = \max_{k,\,l \,\in\, \{-1,0,1\}} \left\{ d\left[ f(i,j),\; f(i-k,\, j-l) \right] \right\} \]

where d[.,.] is a function that assigns a distance value to a pair of feature vectors. (Note: in 
the above maximization, we let k and l each vary from -1 to +1, but do not allow 
k = l = 0 simultaneously. Also, along the borders of the image, we consider only those 
neighboring blocks that lie inside the image boundaries). In our work, we will employ 
two types of distance functions. 

We could use other methods to select the gradient value from the above set of distances, 
for example the minimum, median, etc. We need to evaluate the performance of the 
segmentation algorithm when such methods are used. 
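
The gradient computation can be sketched as follows. The grid of feature vectors is assumed to be a list of rows, `dist` is any distance function on two feature vectors, and the hypothetical `select` argument makes it easy to try the minimum or median in place of the maximum.

```python
from statistics import median


def block_gradient(features, dist, select=max):
    """One scalar gradient per block from distances to its (up to 8) neighbors."""
    rows, cols = len(features), len(features[0])
    grad = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            dists = [
                dist(features[i][j], features[i - k][j - l])
                for k in (-1, 0, 1)
                for l in (-1, 0, 1)
                if (k, l) != (0, 0)                       # skip the block itself
                and 0 <= i - k < rows and 0 <= j - l < cols  # stay inside the image
            ]
            # Assumption: a block with no neighbors at all gets gradient 0.
            grad[i][j] = select(dists) if dists else 0.0
    return grad


# A 1x3 grid of one-component feature vectors with an edge before the last block.
grid = [[[0.0], [0.0], [10.0]]]
abs_diff = lambda f, g: abs(f[0] - g[0])
grad_max = block_gradient(grid, abs_diff)
grad_min = block_gradient(grid, abs_diff, select=min)
grad_med = block_gradient(grid, abs_diff, select=median)
```

With the maximum, both blocks adjacent to the edge receive a high gradient; with the minimum, only the isolated block does.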

5.1 Weighted Euclidean Distance Metric 

Here, the distance function d[.,.] is simply the weighted Euclidean distance between the 
two vectors. 



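\[ d\left[ f(i,j),\, g(k,l) \right] = \sqrt{ \sum_{c \in \{L,a,b\}} \sum_{n=1}^{N} w_{c,n} \left( m^{f}_{c,n} - m^{g}_{c,n} \right)^{2} } \]

where m^f_{c,n} and m^g_{c,n} denote the n-th moment of color component c in the feature 
vectors f and g respectively. In the above formula, the weighting factors { w_{c,n} } can be 
used to account for the differences in scale among the various moments. This metric is 
very easy to implement. 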
In our implementation, we set N = 1, i.e. use only the mean values within each PxQ 
block, and set the weighting factors to unity (this makes sense, since the CIE -L*a*b* 
space is perceptually uniform). 
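
A minimal sketch of this metric is given below. It assumes that each weight multiplies the corresponding squared difference and that a square root is taken at the end; the function name `weighted_euclidean` is hypothetical. With N = 1 each feature vector is just the block mean (L, a, b), and omitting the weights gives unity weighting.

```python
import math


def weighted_euclidean(f, g, weights=None):
    """Weighted Euclidean distance between two equal-length feature vectors."""
    if weights is None:
        weights = [1.0] * len(f)          # unity weights, as in the N = 1 setup
    return math.sqrt(sum(w * (x - y) ** 2 for w, x, y in zip(weights, f, g)))


# Mean L*a*b* values of two neighboring blocks (N = 1).
d_unit = weighted_euclidean([50.0, 0.0, 0.0], [53.0, 4.0, 0.0])
d_weighted = weighted_euclidean([50.0, 0.0, 0.0], [53.0, 4.0, 0.0],
                                weights=[4.0, 0.0, 0.0])
```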



5.2 Probability Mass Function Based Distance Metric 

The second choice of the distance metric is a little more involved. Here, we exploit the 
fact that using the moments of the data within the PxQ block, we can compute an 
approximation to the probability mass function (pmf) of that data. The pmf essentially 
describes the distribution of the data to be composed of a mixture of several values 
v_0, v_1, v_2, ... with respective probabilities p_0, p_1, p_2, .... The values and the 
probabilities together constitute the pmf. We can compute these values using the moments 
as follows. For ease of notation, we will drop the subscripts L, a, and b, because the 
equations that we provide apply to all three color components. 
Initially, we approximate the distribution as a mixture of two values, v_0 and v_1, with 
probabilities p_0 and p_1 respectively. We use the moments-based approach given in Ali 
Tabatabai's Ph.D. thesis to estimate the values v_0, v_1, p_0 and p_1. In this method, we need 
the first three moments of the data (i.e. N = 3): 

\[ m_k = \frac{1}{PQ} \sum_{(x,y)} \left[ c(x,y) \right]^{k}, \qquad k = 1, 2, 3, \]

where c(x, y) are data values in the (i,j)-th block. Then, 

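\[ p_0 = \frac{1}{2} \left[ 1 + s \sqrt{\frac{1}{4 + s^{2}}} \right], \qquad p_1 = 1 - p_0, \]

\[ v_0 = m_1 - \sigma \sqrt{\frac{p_1}{p_0}}, \]

and 

\[ v_1 = m_1 + \sigma \sqrt{\frac{p_0}{p_1}}, \]

where 

\[ s = \frac{m_3 + 2 m_1^{3} - 3 m_1 m_2}{\sigma^{3}}, \qquad \sigma^{2} = m_2 - m_1^{2}. \]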
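
The two-value fit can be sketched as below, assuming the standard moment-preserving closed form in which the two-point distribution reproduces the first three sample moments. The function name `two_level_pmf`, the (value, probability) pair representation, and the handling of a flat block are assumptions made for this example.

```python
import math


def two_level_pmf(values):
    """Fit values v0 < v1 with probabilities p0, p1 preserving m1, m2, m3."""
    n = len(values)
    m1 = sum(values) / n
    m2 = sum(v ** 2 for v in values) / n
    m3 = sum(v ** 3 for v in values) / n
    var = m2 - m1 ** 2
    if var <= 1e-12:
        # Assumption: a flat block is represented by a single value with mass 1.
        return [(m1, 1.0)]
    sigma = math.sqrt(var)
    s = (m3 + 2 * m1 ** 3 - 3 * m1 * m2) / sigma ** 3    # skewness
    p0 = 0.5 * (1 + s * math.sqrt(1 / (4 + s * s)))
    p1 = 1 - p0
    v0 = m1 - sigma * math.sqrt(p1 / p0)
    v1 = m1 + sigma * math.sqrt(p0 / p1)
    return [(v0, p0), (v1, p1)]


# Data that really is two-valued is recovered exactly (up to rounding).
pmf = two_level_pmf([10, 10, 10, 30])
flat = two_level_pmf([7, 7, 7, 7])
```

Because the fit preserves the mean, variance and skewness, data that is already a mixture of two values is returned unchanged.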



Thus, we can convert the moment-based feature vector of each PxQ block into a pmf-based 
representation. Once we have such a representation, then the distance between two 
feature vectors can be computed via the distance between the two pmfs. For this, we 
make use of the Kolmogorov-Smirnov (K-S) test, as described in Section 14.3 of 
"Numerical Recipes in C", 2nd edition, by W. H. Press, S. A. Teukolsky, W. T. 
Vetterling, and B. P. Flannery, Cambridge University Press. (Essentially, the distance 
between two pmfs is the area under the absolute value of the difference between the two 
cumulative distribution functions; see the above-mentioned chapter for details). 
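
The area-based distance described above can be sketched as follows. Each pmf is assumed to be a list of (value, probability) pairs; the function name `cdf_area_distance` is hypothetical. Since the cumulative distribution functions of discrete pmfs are step functions, the area is a sum of rectangles between consecutive support points.

```python
def cdf_area_distance(pmf_f, pmf_g):
    """Area under |CDF_f - CDF_g| for two discrete pmfs."""
    points = sorted({v for v, _ in pmf_f} | {v for v, _ in pmf_g})

    def cdf(pmf, t):
        return sum(p for v, p in pmf if v <= t)

    area = 0.0
    for left, right in zip(points, points[1:]):
        # Both CDFs are constant on [left, right), so the area is a rectangle.
        area += abs(cdf(pmf_f, left) - cdf(pmf_g, left)) * (right - left)
    return area


pmf_f = [(10.0, 0.75), (30.0, 0.25)]
pmf_g = [(20.0, 0.5), (30.0, 0.5)]
d_fg = cdf_area_distance(pmf_f, pmf_g)
```

The classical K-S statistic is the maximum absolute difference between the two cumulative distribution functions; taking the maximum over the same intervals, without multiplying by the interval width, would give that variant.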

Though the K-S test is prescribed for pmfs of a single variable, the data we have is in 
fact three-dimensional (L, a, and b components). Strictly speaking, we need to compute 
the joint, three-dimensional pmf, and then compute a distance between two pmfs. This is 
however a very hard problem to solve, and instead, we make a simplifying assumption. 
We assume that the color data in a PxQ block can be modeled by means of three 
independent pmfs, one each for the L, a, and b components. Let us denote these pmfs by 
pmf^L, pmf^a, and pmf^b respectively. Also, denote the K-S distance measure between 
two pmfs by d_KS(.,.); then the overall distance metric is given by 

\[ d\left[ f(i,j),\, g(k,l) \right] = d_{KS}\!\left( pmf^{L}_{i,j},\, pmf^{L}_{k,l} \right) + d_{KS}\!\left( pmf^{a}_{i,j},\, pmf^{a}_{k,l} \right) + d_{KS}\!\left( pmf^{b}_{i,j},\, pmf^{b}_{k,l} \right). \]
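
Under the independence assumption, the overall metric is just a sum of three per-component distances. A minimal sketch follows, in which the per-component measure is passed in as a function. The dictionary layout keyed by "L", "a" and "b", the name `overall_pmf_distance`, and the toy stand-in `mean_gap` are all hypothetical; in practice the K-S based measure would be supplied instead.

```python
def overall_pmf_distance(f_pmfs, g_pmfs, channel_distance):
    """Sum the per-component pmf distances over L, a and b."""
    return sum(channel_distance(f_pmfs[c], g_pmfs[c]) for c in ("L", "a", "b"))


def mean_gap(pmf_f, pmf_g):
    """Toy stand-in for the per-component measure: absolute difference of means."""
    mean = lambda pmf: sum(v * p for v, p in pmf)
    return abs(mean(pmf_f) - mean(pmf_g))


block_f = {"L": [(10.0, 0.5), (30.0, 0.5)], "a": [(0.0, 1.0)], "b": [(5.0, 1.0)]}
block_g = {"L": [(20.0, 1.0)], "a": [(3.0, 1.0)], "b": [(5.0, 1.0)]}
d_total = overall_pmf_distance(block_f, block_g, mean_gap)
```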